enhancements: Update monitoring alerting consistency proposal #897
openshift-merge-robot merged 1 commit into openshift:master from bison:alerting-consistency
cc: @openshift/sre-alert-sme @openshift/openshift-team-monitoring

The group of critical alerts should be small, very well defined, highly
documented, polished and with a high bar set for entry. This includes a
mandatory review of a proposed critical alert by the Red Hat SRE team.
We should add details on how and where this review happens. How does someone contact the SRE team to request it?

/lgtm Thanks for doing this!

/lgtm

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sdodson

Ping @michaelgugino and @sdodson -- Happy to get more feedback here, but not sure how long to wait.

Great effort, thank you for looking into this @bison. /lgtm

@bison we've waited long enough, thanks

Thanks everyone. Of course, if there ends up being more feedback, please reach out to me. We can keep iterating on this.

The [Alerting Consistency][1] enhancement, and the proposed updates to it in [openshift/enhancements#897][2], define a style guide for the alerts shipped as part of OpenShift. This adds a test that validates some of the guidelines considered required.

[1]: https://github.com/openshift/enhancements/blob/master/enhancements/monitoring/alerting-consistency.md
[2]: openshift/enhancements#897
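
As a rough illustration of what a test validating required alert guidelines might look like, here is a minimal sketch. The rules it checks (a `severity` label limited to `critical`, `warning`, or `info`, plus non-empty `summary` and `description` annotations) are assumptions chosen for the example, not a statement of what the referenced test actually enforces. `AlertRule` and `validate_alert` are hypothetical names.

```python
from dataclasses import dataclass, field

# Assumption for this sketch: the severity values an alert may carry.
ALLOWED_SEVERITIES = {"critical", "warning", "info"}
# Assumption for this sketch: annotations every alert must define.
REQUIRED_ANNOTATIONS = ("summary", "description")


@dataclass
class AlertRule:
    """Hypothetical, simplified stand-in for an alerting rule."""
    name: str
    labels: dict = field(default_factory=dict)
    annotations: dict = field(default_factory=dict)


def validate_alert(rule):
    """Return a list of human-readable violations for one alert rule."""
    violations = []

    # Check that the severity label is present and has an allowed value.
    severity = rule.labels.get("severity")
    if severity not in ALLOWED_SEVERITIES:
        violations.append(
            f"{rule.name}: severity must be one of "
            f"{sorted(ALLOWED_SEVERITIES)}, got {severity!r}"
        )

    # Check that each required annotation is present and non-empty.
    for key in REQUIRED_ANNOTATIONS:
        if not rule.annotations.get(key):
            violations.append(f"{rule.name}: missing required annotation {key!r}")

    return violations
```

A test suite could run `validate_alert` over every shipped rule and fail if any violations are returned. Returning a list instead of raising on the first problem lets a single run report every non-conforming alert.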
This enhancement proposal was initially added in #637. There was lots of discussion at the time, and it was decided at some point that it would be merged as-is to provide a starting point.
While the original proposal provided a great jumping-off point for the discussion around issues with alerting in OpenShift, the monitoring team would now like to see it evolve into a more implementable enhancement proposal focused on developing a practical style guide for OpenShift alerts. We'd also like to see the acceptance criteria for new critical alerts in OpenShift formalized here. This is a first pass at reshaping the original proposal in that direction.